
    Multi-camera reconstruction and rendering for free-viewpoint video

    While virtual environments in interactive entertainment grow ever more lifelike and sophisticated, traditional media such as television and video have not yet embraced the possibilities opened up by rapidly advancing processing power. In particular, they remain as non-interactive as ever and do not allow viewers to change the camera perspective at will. The goal of this work is to advance in that direction and to provide the essential ingredients for a free-viewpoint video system, in which the viewpoint can be chosen interactively during playback. Since knowledge of scene geometry is required to synthesize novel views, we describe 3D reconstruction methods for two distinct kinds of camera setups: depth reconstruction for camera arrays with parallel optical axes, and surface reconstruction for cameras distributed around the scene. Another vital part of a 3D video system is interactive rendering from different viewpoints, which has to run in real time; we cover this topic in the last part of this thesis.
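
    For the first setup (rectified cameras with parallel optical axes), depth follows from disparity via the classical relation Z = f * B / d. A minimal sketch, with all names and numbers illustrative rather than taken from the thesis:

        import numpy as np

        def disparity_to_depth(disparity_px, focal_px, baseline_m, eps=1e-6):
            # Rectified pair with parallel optical axes: Z = f * B / d.
            # eps guards against division by zero where disparity vanishes.
            return focal_px * baseline_m / np.maximum(disparity_px, eps)

        # Example: a 50 px disparity seen with an 800 px focal length and a
        # 10 cm baseline corresponds to a depth of 1.6 m.
        depth_m = disparity_to_depth(np.array([[50.0]]), 800.0, 0.10)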

    Nonconforming Finite Elements and Collocation for Elliptic Boundary Value Problems

    Drawing on techniques for finite elements with 'variational crimes' from the work [B] of Klaus Böhmer, a convergence theory is developed for a class of nonconforming discretizations of elliptic boundary value problems. This class is characterized by approximating functions that are smooth in the interior of the finite elements, but that need to be continuous with continuous normal derivative only at finitely many points along the interfaces between two elements. The class notably includes an efficient collocation method, introduced by Eusebius Doedel in 1997 in [D], whose convergence had not previously been established. Several numerical example computations with a purpose-built software package illustrate the theoretical results.
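
    For orientation (standard material on 'variational crimes', not quoted from the thesis itself): convergence proofs for such nonconforming discretizations are typically organized around Strang's second lemma, which splits the error into a best-approximation term and a consistency term measuring the nonconformity:

        \| u - u_h \|_h \le C \left( \inf_{v_h \in V_h} \| u - v_h \|_h
            + \sup_{w_h \in V_h \setminus \{0\}}
              \frac{| a_h(u, w_h) - \ell(w_h) |}{\| w_h \|_h} \right)

    Here \|\cdot\|_h is a mesh-dependent (broken) norm, a_h the discrete bilinear form, and \ell the load functional; for conforming elements the consistency term vanishes and the classical Céa estimate is recovered.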

    Real-time, Free-viewpoint Video Rendering from Volumetric Geometry

    The aim of this work is to render high-quality views of a dynamic scene from novel viewpoints in real time. An online system available at our institute computes the visual hull as a geometry proxy to guide the rendering at interactive rates. Because the scene is recorded by only a sparse set of cameras distributed around it, only a coarse model of the scene geometry can be recovered. To alleviate this problem, we render textured billboards defined by the voxel model of the visual hull, preserving details in the source images while achieving excellent performance. By exploiting the multi-texturing capabilities of modern graphics hardware, real-time frame rates are attained. Our algorithm can be used as part of an inexpensive system to display 3D videos, or ultimately even in live 3D television. The user is able to watch the scene from an arbitrary viewpoint chosen interactively.
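
    A minimal sketch of the billboard construction step (function and variable names are ours; texturing and GPU blending are omitted), assuming one screen-aligned quad per occupied voxel of the visual hull:

        import numpy as np

        def voxel_billboards(voxel_centers, voxel_size, view_dir):
            # One quad per occupied voxel, oriented to face the viewer.
            view_dir = view_dir / np.linalg.norm(view_dir)
            up = np.array([0.0, 1.0, 0.0])
            if abs(np.dot(up, view_dir)) > 0.99:  # viewing almost straight up/down
                up = np.array([0.0, 0.0, 1.0])
            right = np.cross(up, view_dir)
            right /= np.linalg.norm(right)
            up = np.cross(view_dir, right)
            h = 0.5 * voxel_size
            return np.array([[c - h*right - h*up, c + h*right - h*up,
                              c + h*right + h*up, c - h*right + h*up]
                             for c in voxel_centers])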

    Spacetime-coherent Geometry Reconstruction from Multiple Video Streams

    By reconstructing time-varying geometry one frame at a time, one ignores the continuity of natural motion, wasting useful information about the underlying video-image formation process and accepting temporally discontinuous reconstruction results. In 4D spacetime, the surface of a dynamic object describes a continuous 3D hyper-surface. This hyper-surface can be implicitly defined as the minimum of an energy functional designed to optimize photo-consistency. Based on an Euler-Lagrange reformulation of the problem, we find this hyper-surface from a handful of synchronized video recordings. The resulting object geometry varies smoothly over time, and intermittently invisible object regions are correctly interpolated from previous and/or future frames.
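
    Schematically (our notation; the abstract itself gives no formulas), such spacetime reconstructions minimize a weighted-minimal-hypersurface energy over the 3D hyper-surface \Sigma \subset \mathbb{R}^4:

        E(\Sigma) = \int_\Sigma \Phi \, dA

    where \Phi(s) scores the photo-consistency error of a spacetime point s against the input videos. The associated Euler-Lagrange equation balances \Phi times the mean curvature of \Sigma against the derivative of \Phi along the hyper-surface normal, and can be solved numerically as a surface evolution.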

    Real-Time Microfacet Billboarding For Free-Viewpoint Video Rendering

    We present a hardware-accelerated method for video-based rendering relying on an approximate model of scene geometry. Our goal is to render high-quality views of the scene from arbitrary viewpoints in real time, using as input synchronized video streams from only a small number of calibrated cameras distributed around the scene. Although only a very coarse geometry reconstruction is possible, our rendering approach based on textured billboards preserves the details present in the source images. By exploiting the multi-texturing and blending facilities of modern graphics cards, we achieve real-time frame rates on current off-the-shelf hardware. One possible application of our algorithm is an inexpensive system for displaying 3D videos, which the user can watch from an interactively chosen viewpoint, or ultimately even live 3D television.
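
    One common way to realize the blending step (a heuristic sketch, not necessarily the paper's exact weighting scheme): favor the source cameras whose viewing directions are closest to the novel one:

        import numpy as np

        def camera_blend_weights(novel_dir, cam_dirs, k=2):
            # cam_dirs: (n, 3) unit viewing directions of the source cameras.
            cos = cam_dirs @ novel_dir   # angular proximity per camera
            idx = np.argsort(-cos)[:k]   # k closest cameras
            w = np.clip(cos[idx], 0.0, None)
            s = w.sum()
            return idx, (w / s if s > 0 else np.full(k, 1.0 / k))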

    Hardware-accelerated Dynamic Light Field Rendering

    We present a system capable of interactively displaying a dynamic scene from novel viewpoints by warping and blending images recorded from multiple synchronized video cameras. It is tuned for streamed data and achieves 20 frames per second on modern consumer-class hardware when rendering a 3D movie from an arbitrary eye point within the convex hull of the recording cameras' positions. The quality of the prediction largely depends on the accuracy of the disparity maps, which are reconstructed off-line and provided together with the images. We generalize known algorithms for estimating disparities between two images to the case of multiple image streams, aiming to minimize warping artifacts and to exploit temporal coherence.
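
    The warping step can be sketched as follows (a simplified CPU-side version, assuming rectified views; the actual system runs on graphics hardware and resolves occlusions more carefully):

        import numpy as np

        def forward_warp(image, disparity, alpha):
            # Shift each pixel horizontally by alpha * disparity; alpha in
            # [0, 1] interpolates between two rectified camera positions.
            h, w = disparity.shape
            out = np.zeros_like(image)
            xs = np.arange(w)
            for y in range(h):
                xt = np.clip(np.rint(xs + alpha * disparity[y]).astype(int), 0, w - 1)
                out[y, xt] = image[y, xs]  # naive splat: later writes win; a real
            return out                     # renderer z-buffers by disparity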

    Joint 3D-Reconstruction and Background Separation in Multiple Views using Graph Cuts

    This paper deals with simultaneous depth map estimation and background separation in a multi-view setting with several fixed calibrated cameras, two problems which have previously been addressed separately. We demonstrate that their strong interdependency can be exploited elegantly by minimizing a discrete energy functional which evaluates both properties at the same time. Our algorithm is derived from the powerful "Multi-Camera Scene Reconstruction via Graph Cuts" algorithm recently presented by Kolmogorov and Zabih. Experiments with both real-world and synthetic scenes demonstrate that the combined approach yields more accurate depth estimates. In particular, the additional information gained by taking the background into account considerably increases the algorithm's robustness against noise.
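
    The paper's minimization itself relies on graph cuts (alpha-expansion); purely to fix ideas, here is the form of such a discrete energy with one extra label for "background" (array shapes and names are our assumptions):

        import numpy as np

        def energy(labels, data_cost, lam=1.0):
            # labels:    (H, W) ints; 0..D-1 are depth hypotheses, D = background.
            # data_cost: (H, W, D+1) per-pixel costs (photo-consistency for depth
            #            labels, background-model likelihood for label D).
            H, W = labels.shape
            e = data_cost[np.arange(H)[:, None], np.arange(W)[None, :], labels].sum()
            e += lam * (labels[:, 1:] != labels[:, :-1]).sum()  # Potts smoothness,
            e += lam * (labels[1:, :] != labels[:-1, :]).sum()  # 4-neighborhood
            return e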

    Effective Aesthetics Prediction with Multi-level Spatially Pooled Features

    We propose an effective deep learning approach to aesthetic quality assessment that relies on a new type of pre-trained features, and apply it to the AVA data set, currently the largest aesthetics database. While previous approaches discard some of the information in the original images by taking small crops, down-scaling, or warping the originals during training, we propose the first method that efficiently supports full-resolution images as input and can be trained on variable input sizes. This allows us to significantly improve upon the state of the art, increasing the Spearman rank-order correlation coefficient (SRCC) with ground-truth mean opinion scores (MOS) from the best previously reported value of 0.612 to 0.756. To achieve this performance, we extract multi-level spatially pooled (MLSP) features from all convolutional blocks of a pre-trained InceptionResNet-v2 network and train a custom shallow Convolutional Neural Network (CNN) architecture on these new features.
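
    A sketch of the feature-extraction idea in Keras (assuming TensorFlow is available; the layer-name filter and the head sizes are our assumptions, not the paper's exact configuration):

        import tensorflow as tf

        base = tf.keras.applications.InceptionResNetV2(
            include_top=False, weights='imagenet', input_shape=(None, None, 3))

        # Tap one activation per inception block; inspect base.summary() to
        # confirm which layer names correspond to the block outputs.
        taps = [l.output for l in base.layers if l.name.endswith('_mixed')]
        pooled = [tf.keras.layers.GlobalAveragePooling2D()(t) for t in taps]
        features = tf.keras.layers.Concatenate()(pooled)

        # Shallow head regressing the mean opinion score on the pooled
        # multi-level features.
        x = tf.keras.layers.Dense(512, activation='relu')(features)
        x = tf.keras.layers.Dropout(0.25)(x)
        mos = tf.keras.layers.Dense(1)(x)
        model = tf.keras.Model(base.input, mos)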
